Algorithmic Aspects of Perfect Graphs

by 
Martin  Charles  Golumbic 

Consider a collection $C = \{c_i\}$ of courses being offered
by a major university. Let $T_i$ be the time interval during
which course $c_i$ is to take place. We would like to assign
courses to classrooms so that no two courses meet in the
same room at the same time.

This problem can be solved by properly coloring the
vertices of the graph $G = (C, E)$ where
$c_i c_j \in E \iff T_i \cap T_j \neq \emptyset$.
We may interpret each color as corresponding to a different
classroom. The graph G is an interval graph, since it is
represented by intersecting time intervals.

This  example  is  especially  interesting  because  efficient, 
linear-time  algorithms  are  known  for  coloring  interval  graphs 
with  a  minimum  number  of  colors.   (The  minimum  coloring 
problem  is  NP-complete  for  general  graphs.) 
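
The classroom example can be made concrete with a short sketch.
It is not one of the linear-time algorithms referred to above:
it sorts the courses first, so it runs in O(n log n) steps. The
function name assign_rooms and the convention that a course
occupies the half-open interval [start, end) are assumptions
made only for this illustration.

```python
import heapq


def assign_rooms(intervals):
    """Greedy coloring of an interval graph (rooms play the role of colors).

    intervals: list of (start, end) pairs, read as half-open [start, end).
    Returns a list whose i-th entry is the room index given to course i.
    """
    order = sorted(range(len(intervals)), key=lambda i: intervals[i][0])
    room = [None] * len(intervals)
    busy = []       # min-heap of (end_time, room_index) for occupied rooms
    free = []       # min-heap of room indices that have been released
    next_room = 0
    for i in order:
        start, end = intervals[i]
        # Release every room whose course has ended by `start`.
        while busy and busy[0][0] <= start:
            _, r = heapq.heappop(busy)
            heapq.heappush(free, r)
        if free:
            r = heapq.heappop(free)
        else:
            r = next_room       # open a new room only when forced to
            next_room += 1
        room[i] = r
        heapq.heappush(busy, (end, r))
    return room
```

A new room is opened only when every room already in use is
occupied by a course meeting at that moment, so the number of
rooms equals the largest number of simultaneous courses.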

In  this  paper  we  will  survey  a  number  of  topics  in 
algorithmic  graph  theory  which  involve  classes  of  perfect 
graphs.   We  will  also  discuss  some  recent  applications  of 
perfect  graphs  to  computer  science.   The  intention  of  this 
article  is  to  provide  an  understanding  of  the  main  research 
directions  which  have  been  investigated  and  to  suggest  possible 
new  areas  of  research.   The  sections  of  this  paper  are  numbered 
to  correspond  with  the  chapters  of  the  author's  book 
"Algorithmic  Graph  Theory  and  Perfect  Graphs".   The  interested 
reader  is  referred  to  this  book  for  further  study. 

2. The Design of Efficient Algorithms

Algorithmic complexity analysis deals with the quantitative
aspects of problem solving. It addresses the issue of what can
be computed within a practical or reasonable amount of time
and space by measuring the resource requirements exactly or by
obtaining upper and lower bounds for them. Complexity is
actually determined at three levels: the problem, the
algorithm, and the implementation. Naturally, we want the best
algorithm which solves our problem, and we want to choose the
best implementation of that algorithm.

Consider the problem of determining whether an undirected
graph G is connected. A mathematically elegant solution is
the following: G is connected if and only if
$I + M + M^2 + M^3 + \cdots + M^n$ has no zero entries, where M
is the adjacency matrix of G, I is the identity matrix, and n
is the number of vertices of G. However, using this theorem as
an algorithm would require much more work (matrix
multiplication and addition) than is actually needed to test
connectivity. A better way would be to traverse the edges of
the graph. The following algorithm will test connectivity and
find a spanning tree efficiently.

Standard Spanning Tree Algorithm (SST)

Step I: Start with a tree T consisting of one arbitrary
vertex and no edges.

Step II: If T contains all the vertices of G, then STOP
[T is a spanning tree]. Otherwise, do step III.

Step III: Add to T an edge (x,y) which joins a vertex y
not yet in T to a vertex x already in T. If no such edge exists,
then STOP [there is no spanning tree; G is not connected].
Otherwise, go to step II.

In Step III of our algorithm there may be several edges
(x,y) eligible to be added to T. We call such an edge a
candidate edge. Various priorities can be established to
guide the choice of candidates, and each priority will yield
a slightly different algorithm. If candidates are stored in
a queue, then SST gives a breadth-first search (BFS) of G.
If candidates are stored in a stack, SST performs a
depth-first search (DFS). If the edges have costs associated
with them, and if the candidate with minimum cost is always
chosen, then SST produces a minimum cost spanning tree (MST).
Similarly, shortest path algorithms and critical path
algorithms can be designed by adapting SST with a suitable
priority for choosing candidates.
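
The role of the candidate container can be illustrated by a
small sketch in which the same loop performs BFS, DFS or MST
construction depending on a single parameter. The function
name spanning_tree, the policy strings and the representation
of G as a dictionary of dictionaries are assumptions of this
example. Vertices must be mutually comparable so that ties in
cost can be broken inside the heap.

```python
from collections import deque
import heapq


def spanning_tree(adj, root, policy="queue"):
    """SST with a pluggable candidate container.

    adj: dict mapping each vertex to a dict {neighbor: cost}.
    policy: "queue" (BFS), "stack" (DFS) or "heap" (minimum cost tree).
    Returns the list of tree edges, or None if G is not connected.
    """
    in_tree = {root}
    tree = []
    if policy == "heap":
        cand = []
        push = lambda e: heapq.heappush(cand, e)
        pop = lambda: heapq.heappop(cand)
    else:
        cand = deque()
        push = cand.append
        pop = cand.popleft if policy == "queue" else cand.pop

    def add_candidates(x):
        # Every edge from x to a vertex outside T becomes a candidate.
        for y, cost in adj[x].items():
            if y not in in_tree:
                push((cost, x, y))

    add_candidates(root)
    while cand and len(in_tree) < len(adj):
        cost, x, y = pop()
        if y in in_tree:        # stale candidate: y joined T meanwhile
            continue
        in_tree.add(y)
        tree.append((x, y))
        add_candidates(y)
    return tree if len(in_tree) == len(adj) else None
```

Stale candidates are discarded lazily when they are removed
from the container, which keeps the three variants identical
apart from the container itself.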

The  complexity  of  the  spanning  tree  algorithm  depends 
on  how  the  graph  is  stored  and  whether  anything  special  is 
done  to  the  candidate  edges.   The  table  below  summarizes 
these  complexities. 


Complexity of the Standard Spanning Tree Algorithm

                      Adjacency Matrix      Adjacency Sets
Candidates in a       Stored as an Array    Stored as Lists
                                            or Sequentially
------------------------------------------------------------
Stack (DFS)           O(n^2)                O(n + e)
Queue (BFS)           O(n^2)                O(n + e)
Reverse Heap (MST)    O(n^2 + e log e)      O(n + e log e)


A graph problem is said to be linear in the size of the
graph if it has an algorithm which can be implemented to run
in O(n + e) steps on a graph with n vertices and e edges.
Thus testing connectivity is a linear graph problem. This
is usually the best that one could expect for any nontrivial
graph problem since every vertex and every edge would probably
have to be examined at least once. A problem is called
polynomial if it has an algorithm which can run in O(p(n))
steps where p is a polynomial function.

The algorithmic graph problems that we will examine in
this survey paper include recognizing various classes of
perfect graphs and finding minimum colorings, minimum clique
covers, maximum cliques, and maximum stable sets. We will
be particularly interested in special purpose polynomial
algorithms designed to solve these problems for particular
classes of perfect graphs. The reason such algorithms
are important is that for arbitrary graphs these last four
problems are NP-complete. That is, they are in a large class
of problems for which all currently known algorithms require
an exponential amount of running time, and which are all
related in such a way that if any one of them could be solved
in polynomial time, then so could all problems in this class.


3.   Perfect  Graphs 

An undirected graph G = (V,E) is perfect if it satisfies
any of the following equivalent conditions:

(P1)  $\omega(G_A) = \chi(G_A)$  (for all $A \subseteq V$)

(P2)  $\alpha(G_A) = \theta(G_A)$  (for all $A \subseteq V$)

(P3)  $\omega(G_A)\,\alpha(G_A) \geq |A|$  (for all $A \subseteq V$)

Here $G_A$ denotes the subgraph induced by A, and $\omega$,
$\chi$, $\alpha$ and $\theta$ denote the clique number, the
chromatic number, the stability number and the clique cover
number, respectively. The equivalence of (P1)-(P3) is known as
the Perfect Graph Theorem.

An open question whose solution has eluded researchers
for two decades is to prove or disprove the following
conjecture of Claude Berge.

Strong Perfect Graph Conjecture (SPGC). An undirected
graph G is perfect if and only if in G and in its complement
$\overline{G}$ every odd cycle of length $\geq 5$ has a chord.
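
As a small worked example, the chordless cycle on five
vertices violates condition (P3) when A is the whole vertex
set, so it is not perfect, in agreement with the conjecture.
The sketch below checks this by brute force. It is exponential
and meant only for very small graphs; the function name is an
assumption of this illustration.

```python
from itertools import combinations


def clique_and_stability_numbers(vertices, edges):
    """Return (omega, alpha) by trying every subset of vertices."""
    vertices = list(vertices)
    E = {frozenset(e) for e in edges}
    omega = alpha = 0
    for r in range(1, len(vertices) + 1):
        for S in combinations(vertices, r):
            pairs = [frozenset(p) for p in combinations(S, 2)]
            if all(p in E for p in pairs):        # S is a clique
                omega = max(omega, r)
            if all(p not in E for p in pairs):    # S is a stable set
                alpha = max(alpha, r)
    return omega, alpha


# The chordless 5-cycle 0-1-2-3-4-0.
c5 = [(i, (i + 1) % 5) for i in range(5)]
omega, alpha = clique_and_stability_numbers(range(5), c5)
print(omega, alpha, omega * alpha >= 5)   # prints: 2 2 False
```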

Although proving the SPGC seems to be a mathematical rather
than an algorithmic problem, it does raise an interesting
algorithmic question.

Is there a polynomial algorithm which recognizes whether
or not an undirected graph G has an odd chordless cycle of
length $\geq 5$?

We  have  no  answer  to  this  question.   However,  if  there  is 
such  an  algorithm  and  if  the  SPGC  is  true,  then  it  would 
answer  another  open  question. 

Is  there  a  polynomial  algorithm  which  recognizes  whether 
or  not  an  undirected  graph  G  is  perfect? 

In a very recent paper Grötschel, Lovász, and Schrijver
[1980] have shown that the ellipsoid method of solving linear
programming problems can be applied to obtain a polynomial
algorithm to find maximum stable sets and minimum colorings
for perfect graphs. Also, since G is perfect if and only if
its complement $\overline{G}$ is perfect, this same approach
can be used to find maximum cliques and minimum clique covers.
The major importance of this result is that it generalizes
what had been known for certain classes of perfect graphs.
Although the complexity of the algorithm is polynomial, it may
not be practical to implement. As the authors point out, it is
not intended to compete with the special purpose algorithms
designed to solve these problems for interval graphs,
comparability graphs, triangulated graphs, and other classes
of perfect graphs which so often arise in applications.

4. Triangulated Graphs

An undirected graph G is called triangulated if every
cycle of length strictly greater than 3 possesses a chord,
that is, an edge joining two nonconsecutive vertices of the
cycle. In the literature, triangulated graphs have also
been called chordal, rigid-circuit, monotone transitive,
and perfect elimination graphs.

A vertex x of G is called simplicial if its adjacency
set Adj(x) induces a complete subgraph of G, i.e., Adj(x)
is a clique (not necessarily maximal). Dirac [1961], and
later Lekkerkerker and Boland [1962], proved that a
triangulated graph always has a simplicial vertex (in fact
at least two of them). Using this fact together with the
hereditary property of triangulated graphs, Fulkerson and
Gross [1965] suggested an iterative procedure to recognize
them. Namely, repeatedly locate a simplicial vertex and
eliminate it from the graph, until either no vertices remain
and the graph is triangulated, or at some stage no simplicial
vertex exists and the graph is not triangulated. The
correctness of this procedure is given in Theorem 4.1. Let us
state things more algebraically.
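
The iterative procedure just described can be sketched
directly. The version below is a naive illustration that runs
in polynomial but not linear time; the names is_simplicial and
eliminate, and the representation of G as a dictionary of
neighbor sets, are assumptions of this example.

```python
def is_simplicial(adj, x):
    """True if the neighbors of x are pairwise adjacent."""
    nbrs = list(adj[x])
    return all(v in adj[u] for i, u in enumerate(nbrs) for v in nbrs[i + 1:])


def eliminate(adj):
    """Repeatedly remove a simplicial vertex.

    adj: dict mapping each vertex to the set of its neighbors.
    Returns the order of removal if every vertex can be removed
    (the graph is triangulated), and None otherwise.
    """
    adj = {v: set(ns) for v, ns in adj.items()}   # work on a copy
    order = []
    while adj:
        x = next((v for v in adj if is_simplicial(adj, v)), None)
        if x is None:
            return None          # no simplicial vertex is left
        for u in adj[x]:
            adj[u].discard(x)
        del adj[x]
        order.append(x)
    return order
```

For a triangle with one pendant vertex the procedure removes
all four vertices, whereas for a chordless 4-cycle it stops
immediately because no vertex is simplicial.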

Let G = (V,E) be an undirected graph and let
$\sigma = [v_1, v_2, \ldots, v_n]$ be an ordering of the vertices.
We say that $\sigma$ is a perfect vertex elimination scheme (or
perfect scheme) if each $v_i$ is a simplicial vertex of the
induced subgraph $G_{\{v_i, \ldots, v_n\}}$. In other words,
each set

$\{v_j \in Adj(v_i) \mid j > i\}$

is complete. For example, the graph $G_1$ in Figure 4.1 has a
perfect vertex elimination scheme $\sigma = [a, g, b, f, c, e, d]$.
It is not unique; in fact $G_1$ has 96 different perfect
elimination schemes. In contrast to this, the graph $G_2$ has no
simplicial vertex, so we cannot even start constructing a
perfect scheme; it has none.

Algorithm 4.1. Maximum cardinality search.

Input: The adjacency sets of an undirected graph G = (V,E).
Output: An ordering σ of the vertices.

Method: The vertices are numbered from n to 1 in the order that
they are selected in line 3. This numbering fixes the
positions of an elimination scheme σ. For each unnumbered
vertex x, the label of x will consist of the number of
numbered vertices adjacent to x. The vertices can then be
ordered according to their labels. Ties are broken arbitrarily.
The algorithm is as follows:

1.  assign the label 0 to each vertex;
2.  for i ← n to 1 by -1 do
3.      select: pick an unnumbered vertex v with largest label;
4.      σ(i) ← v;  [this assigns to v the number i]
5.      update: for each unnumbered vertex w ∈ Adj(v) do
6.          add 1 to label(w); end
    end
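
A direct transcription of the six lines above may help in
experimenting with the method. This sketch selects the vertex
of largest label by a linear scan, so it takes O(n^2) steps
rather than linear time. The function name and the dictionary
representation of the adjacency sets are assumptions of this
example.

```python
def maximum_cardinality_search(adj):
    """Return sigma as a list, where sigma[i - 1] is the vertex numbered i.

    adj: dict mapping each vertex to the set of its neighbors.
    """
    n = len(adj)
    label = {v: 0 for v in adj}                       # line 1
    sigma = [None] * n
    unnumbered = set(adj)
    for i in range(n, 0, -1):                         # line 2
        v = max(unnumbered, key=label.__getitem__)    # line 3 (ties arbitrary)
        sigma[i - 1] = v                              # line 4
        unnumbered.remove(v)
        for w in adj[v]:                              # line 5
            if w in unnumbered:
                label[w] += 1                         # line 6
    return sigma
```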

The  fact  that  maximum  cardinality  search  can  be  used  to 
recognize  triangulated  graphs  is  demonstrated  by  the  next 
theorem. 

Theorem 4.3. An undirected graph G = (V,E) is
triangulated if and only if the ordering $\sigma$ produced by
Algorithm 4.1 is a perfect vertex elimination scheme.

Proof. If G has only 1 vertex, then the proof is trivial.
Assume that the theorem is true for all graphs with fewer
than n vertices and let $\sigma$ be the ordering produced by
Algorithm 4.1 when applied to a triangulated graph G.
By induction, it is sufficient to show that $v = \sigma(1)$ is
a simplicial vertex of G.

CLAIM: G may not contain a chordless path
$\mu = [u, v_1, v_2, \ldots, v_k, w]$ with $k \geq 1$ satisfying
the property

$\sigma^{-1}(v_i) < \sigma^{-1}(u) < \sigma^{-1}(w)$  for all i.   (1)

Suppose G contains such a path $\mu$, and choose $\mu$ such
that $\sigma^{-1}(u)$ is largest possible. Since u was numbered
before $v_k$, and since $v_k$, but not u, is adjacent to w,
there must be some vertex x such that
$\sigma^{-1}(u) < \sigma^{-1}(x)$ which is adjacent to u but not
to $v_k$. Let j be the largest index such that x is adjacent to
$v_j$, where we let $v_0 = u$. Then the path
$\mu' = [x, v_j, \ldots, v_k, w]$ must be chordless, since its
only possible chord xw would give a chordless cycle of length
$\geq 4$. If $\sigma^{-1}(x) < \sigma^{-1}(w)$, then $\mu'$ would
satisfy (1) and contradict the maximality of $\sigma^{-1}(u)$,
since $\sigma^{-1}(v_i) < \sigma^{-1}(u) < \sigma^{-1}(x)$. So it
must be that $\sigma^{-1}(w) < \sigma^{-1}(x)$. But this implies
that $\mu'' = [w, v_k, \ldots, v_j, x]$ satisfies (1) and also
contradicts the maximality. It follows that no such path $\mu$
can exist in G, which proves the claim.

Now let $v = \sigma(1)$ and suppose that v is not simplicial.
Choose $u, w \in Adj(v)$ with $uw \notin E$ so that
$\sigma^{-1}(u) < \sigma^{-1}(w)$. Then the path [u,v,w]
satisfies (1), which contradicts the claim. Therefore, v is
simplicial and, by induction, $\sigma$ is a perfect elimination
scheme. The converse follows from Theorem 4.1. □

The complexity of Algorithm 4.1 is linear in the size
of G. One such efficient implementation is the following.
Let $S_i$ be the set of unnumbered vertices whose label is i,
and let $S_i$ be represented by a doubly linked list. For each
vertex we store its label i and a pointer to its position in
the set $S_i$. When a vertex v is numbered it is removed
from its set, and we move each unnumbered adjacent vertex w up
by one set; this can be executed in O(1 + degree(v)) steps.
Thus, the entire algorithm will be O(|V| + |E|).

In order to use MCS to recognize triangulated graphs,
we need an efficient method to test whether or not a given
ordering $\sigma$ of the vertices is a perfect elimination scheme.
Such an algorithm is given in Rose, Tarjan, and Lueker [1976]
and has complexity O(|V| + |E|). (See also Golumbic [1980,
pp. 88-91].)
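
One way such a test can be organized is sketched below. It is
written in the spirit of the linear-time test cited above but
is not claimed to reproduce it: each vertex v hands its later
neighbors, other than the first one u, to u, and u verifies
them when its own turn comes. The function name and the use of
hash sets for adjacency are assumptions of this example.

```python
def is_perfect_elimination_scheme(adj, sigma):
    """Test whether the ordering sigma is a perfect elimination scheme.

    adj: dict mapping each vertex to the set of its neighbors.
    sigma: list of all vertices, in elimination order.
    """
    pos = {v: i for i, v in enumerate(sigma)}
    pending = {v: [] for v in sigma}   # vertices that must be adjacent to v
    for v in sigma:
        # Deferred checks handed to v by vertices earlier in sigma.
        if any(x not in adj[v] for x in pending[v]):
            return False
        later = [x for x in adj[v] if pos[x] > pos[v]]
        if later:
            u = min(later, key=pos.__getitem__)   # first later neighbor
            pending[u].extend(x for x in later if x != u)
    return True
```

Every adjacency list is scanned once and each edge gives rise
to at most one deferred check, so with constant-time membership
tests the work is proportional to the size of the graph.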


Fast Algorithms for the Coloring, Clique, Stable Set
and Clique Cover Problems on Triangulated Graphs

Let G = (V,E) be a triangulated graph, and let $\sigma$ be a
perfect elimination scheme for G. It was first pointed out by
Fulkerson and Gross [1965] that every maximal clique is
of the form $\{v\} \cup A_v$ where

$A_v = \{x \in Adj(v) \mid \sigma^{-1}(v) < \sigma^{-1}(x)\}$.

However, some of these sets $\{v\} \cup A_v$ will not be
maximal, and we would like to filter them out. This can be
done, yielding the chromatic number and all maximal cliques
of a triangulated graph, in O(|V| + |E|) time.
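
The sketch below illustrates how a perfect elimination scheme
yields a minimum coloring and a maximum clique at the same
time. It does not perform the filtering of non-maximal sets
mentioned above; instead it colors the vertices greedily in
reverse elimination order, a standard technique substituted
here for illustration. The function name is an assumption.

```python
def color_and_clique(adj, sigma):
    """Greedy coloring along the reverse of a perfect elimination scheme.

    adj: dict mapping each vertex to the set of its neighbors.
    sigma: a perfect elimination scheme, as a list of vertices.
    Returns (color, clique): a proper coloring and a largest clique.
    """
    pos = {v: i for i, v in enumerate(sigma)}
    color = {}
    best = []
    for v in reversed(sigma):
        # Later neighbors of v; they form a clique and are already colored.
        later = [x for x in adj[v] if pos[x] > pos[v]]
        used = {color[x] for x in later}
        color[v] = next(c for c in range(len(later) + 1) if c not in used)
        if len(later) + 1 > len(best):
            best = [v] + later
    return color, best
```

Since the later neighbors of v are pairwise adjacent, v never
needs a color beyond their number, so the number of colors
used equals the size of the clique returned.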

The problem of finding the stability number $\alpha(G)$ of
a triangulated graph and a clique cover of size $\alpha(G)$ is
solved by Gavril [1972]. A linear implementation of his
algorithm can be obtained by using techniques of Rose,
Tarjan and Lueker [1976].

Let $\sigma$ be a perfect elimination scheme for G = (V,E).
We define inductively a sequence of vertices
$y_1, y_2, \ldots, y_t$ in the following manner: $y_1 = \sigma(1)$;
$y_i$ is the first vertex in $\sigma$ which follows $y_{i-1}$ and
which is not in $A_{y_1} \cup A_{y_2} \cup \cdots \cup A_{y_{i-1}}$;
all vertices following $y_t$ are in
$A_{y_1} \cup \cdots \cup A_{y_t}$. Hence,

$V = \{y_1, y_2, \ldots, y_t\} \cup A_{y_1} \cup \cdots \cup A_{y_t}$.

The following theorem applies.

Theorem 4.4 (Gavril [1972]). The set $\{y_1, y_2, \ldots, y_t\}$
is a maximum stable set of G, and the collection of sets
$Y_i = \{y_i\} \cup A_{y_i}$ (i = 1, 2, ..., t) comprises a minimum
clique cover of G.

Proof. The set $\{y_1, y_2, \ldots, y_t\}$ is stable since if
$y_j y_i \in E$ for j < i, then $y_i \in A_{y_j}$, which cannot
be. Thus, $\alpha(G) \geq t$. On the other hand, each of the sets
$Y_i = \{y_i\} \cup A_{y_i}$ is a clique, and so
$\{Y_1, \ldots, Y_t\}$ is a clique cover of G, giving
$\theta(G) \leq t$. Since $\alpha(G) \leq \theta(G)$ for every
graph, $\alpha(G) = \theta(G) = t$, and we have produced the
desired maximum stable set and minimum clique cover. □
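
The construction used in Theorem 4.4 can be followed step by
step in a short sketch: scan the scheme once, take each vertex
that has not yet been covered, and cover its later neighbors.
The function name and the dictionary representation of G are
assumptions of this example, and the ordering passed in is
assumed to be a perfect elimination scheme.

```python
def stable_set_and_clique_cover(adj, sigma):
    """Maximum stable set and minimum clique cover from a perfect scheme.

    adj: dict mapping each vertex to the set of its neighbors.
    sigma: a perfect elimination scheme, as a list of vertices.
    """
    pos = {v: i for i, v in enumerate(sigma)}
    covered = set()          # union of the later-neighbor sets chosen so far
    stable, cover = [], []
    for v in sigma:
        if v in covered:
            continue         # v already lies in an earlier clique
        later = {x for x in adj[v] if pos[x] > pos[v]}
        stable.append(v)             # v is the next vertex of the sequence
        cover.append({v} | later)    # the corresponding clique
        covered |= later
    return stable, cover
```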

5. Comparability Graphs

An undirected graph G = (V,E) is a comparability graph
if there exists an orientation (V,F) of G satisfying

$F \cap F^{-1} = \emptyset$,   $F + F^{-1} = E$,   $F^2 \subseteq F$,

where $F^2 = \{ac \mid ab, bc \in F \text{ for some vertex } b\}$
and $F^{-1}$ is the reversal of F. The relation F is a strict
partial ordering of V whose comparability relation is exactly E,
and F is called a transitive orientation of G (or of E).
Comparability graphs are also known as transitively orientable
graphs and partially orderable graphs. Examples of some
comparability graphs can be found in Figure 5.1.

Let us see what happens when we try to assign a transitive
orientation to the 4-cycle (Figure 5.2a). Arbitrarily choosing
$ab \in F$ forces us to orient the bottom edge toward b and
the top edge toward d (for otherwise transitivity would be
violated). These in turn force the remaining edge to be
oriented toward d. Applying the same idea to the graph in
Figure 5.2b we find that a contradiction arises, namely,
choosing $ab \in F$ forces successively the orientations
cb, cd, cf, ef, bf, ba. This graph is not a comparability graph.
We now make the notion of forcing more precise.

Define the binary relation $\Gamma$ on the edges of an
undirected graph G = (V,E) as follows:

$ab \;\Gamma\; a'b'$   iff   either $a = a'$ and $bb' \notin E$
                             or $b = b'$ and $aa' \notin E$.

[Figure 5.1. Transitive orientations of two comparability
graphs: the A graph and the suspension bridge graph.]

[Figure 5.2. Examples of forcing, panels (a) and (b). The
arbitrary choice of $ab \in F$ forces the other indicated
orientations.]

[Figure 5.3.]

We say that ab directly forces a'b' whenever $ab \;\Gamma\; a'b'$.
Since E is irreflexive, $ab \;\Gamma\; ab$; however,
$ab \not\Gamma\; ba$. The reader should not continue until he is
convinced of this fact.

The reflexive, transitive closure $\Gamma^*$ of $\Gamma$ is easily
shown to be an equivalence relation on E and hence partitions
E into what we shall call the implication classes of G. Thus
edges ab and cd are in the same implication class if and only
if there exists a sequence of edges

$ab = a_0 b_0 \;\Gamma\; a_1 b_1 \;\Gamma\; \cdots \;\Gamma\; a_k b_k = cd$,   with $k \geq 0$.

Such a sequence is called a $\Gamma$-chain from ab to cd, and we
say that ab (eventually) forces cd whenever $ab \;\Gamma^*\; cd$.

Examples. The graph G of Figure 5.3 has 8 implication
classes:

$A_1 = \{ab\}$, $A_2 = \{cd\}$, $A_3 = \{ac, ad, ae\}$, $A_4 = \{bc, bd, be\}$,
$A_1^{-1} = \{ba\}$, $A_2^{-1} = \{dc\}$, $A_3^{-1} = \{ca, da, ea\}$, $A_4^{-1} = \{cb, db, eb\}$.

On the other hand, the graph in Figure 5.2b has only
one implication class:

$A = \{ab, cb, cd, cf, ef, bf, ba, bc, dc, fc, fe, fb\}$.

Let A be an implication class of an undirected graph G,
and let $\hat{A} = A \cup A^{-1}$ denote the symmetric closure
of A. It can be shown that if G has a transitive orientation F,
then either $F \cap \hat{A} = A$ (F completely agrees with A) or
$F \cap \hat{A} = A^{-1}$ (F completely disagrees with A) and, in
either case, $A \cap A^{-1} = \emptyset$. The converse of this is
also valid, namely, if $A \cap A^{-1} = \emptyset$ for every
implication class A, then G has a transitive orientation.
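
The implication classes of a small graph can be enumerated
mechanically by closing each directed edge under the forcing
relation. The sketch below does this by breadth-first search
and then applies the criterion just stated. The function names
and the dictionary representation are assumptions of this
example; it is a straightforward illustration rather than an
efficient implementation.

```python
from collections import deque


def implication_classes(adj):
    """Return the implication classes as a list of sets of directed edges.

    adj: dict mapping each vertex to the set of its neighbors.
    """
    seen = set()
    classes = []
    for a in adj:
        for b in adj[a]:
            if (a, b) in seen:
                continue
            cls = {(a, b)}
            queue = deque([(a, b)])
            while queue:
                x, y = queue.popleft()
                # xy forces xy' when yy' is not an edge,
                # and forces x'y when xx' is not an edge.
                forced = [(x, y2) for y2 in adj[x] if y2 not in adj[y]]
                forced += [(x2, y) for x2 in adj[y] if x2 not in adj[x]]
                for e in forced:
                    if e not in cls:
                        cls.add(e)
                        queue.append(e)
            seen |= cls
            classes.append(cls)
    return classes


def is_comparability(adj):
    """True if no implication class contains an edge and its reversal."""
    return all((b, a) not in cls
               for cls in implication_classes(adj) for a, b in cls)
```

Applied to the edge set ab, cd, ac, ad, ae, bc, bd, be (read
off from the classes listed for Figure 5.3), the first function
returns 8 classes; applied to ab, bc, cd, cf, ef, bf (read off
from the class listed for Figure 5.2b) it returns a single
class of 12 directed edges.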

Remark. Many readers may wonder whether an arbitrary union
of implication classes $F = \bigcup_i A_i$ satisfying
$F \cap F^{-1} = \emptyset$ and $F + F^{-1} = E$ is necessarily
a transitive orientation of G. The answer is no. As a
counterexample, consider a triangle, which has $8 = 2^3$ such
orientations, two of which fail to be transitive.

Methods for determining the exact number of transitive
orientations t(G) of a given undirected graph G have been
developed by Shevrin and Filippov [1970] and Golumbic [1977a],
and a characterization of uniquely partially orderable
graphs (i.e., t(G) = 2) is given in Shevrin and Filippov [1970]
and Trotter, Moore and Sumner [1976]. These results and
others are discussed in detail in Golumbic [1980].

We shall now describe an algorithm for calculating
transitive orientations and for determining whether or not
a graph is a comparability graph. This technique is a
modification of one first presented by Pnueli, Lempel and Even
[1971]. A discussion of its computational complexity will
follow.

Let G = (V,E) be an undirected graph. A partition of
the edge set $E = \hat{B}_1 + \hat{B}_2 + \cdots + \hat{B}_k$,
where $\hat{B}_i = B_i \cup B_i^{-1}$ denotes the symmetric
closure of $B_i$, is called a G-decomposition of E if $B_i$ is
an implication class of $\hat{B}_i + \cdots + \hat{B}_k$ for all
i = 1, 2, ..., k. A sequence of edges
$[x_1 y_1, x_2 y_2, \ldots, x_k y_k]$ is called a decomposition
scheme for G if there exists a G-decomposition
$E = \hat{B}_1 + \hat{B}_2 + \cdots + \hat{B}_k$ satisfying
$x_i y_i \in B_i$ for all i = 1, 2, ..., k. In this section the
term scheme will always mean a decomposition scheme.

For a given G-decomposition there will be many corresponding
schemes (any set of representatives from the $B_i$). However,
for a given scheme there exists exactly one corresponding
G-decomposition. A scheme and G-decomposition can be constructed
by the following procedure:

Algorithm 5.1. Decomposition Algorithm

Let G = (V,E) be an undirected graph. Initially let i = 1
and $E_1 = E$.

Step I:    Arbitrarily pick an edge $e_i = x_i y_i \in E_i$.

Step II:   Enumerate the implication class $B_i$ of $E_i$
           containing $x_i y_i$.

Step III:  Define $E_{i+1} = E_i - \hat{B}_i$.

Step IV:   If $E_{i+1} = \emptyset$, then let k = i and STOP;
           otherwise, increase i by 1 and go back to step I.

Clearly, the Decomposition Algorithm yields a scheme
$[x_1 y_1, \ldots, x_k y_k]$ and corresponding G-decomposition
$\hat{B}_1 + \cdots + \hat{B}_k$ for any undirected graph G.
Moreover, if $y_i x_i$ had been chosen instead of $x_i y_i$ for
some i, then $B_i^{-1}$ would replace $B_i$ in the
G-decomposition. Applying the algorithm to the graph
in Figure 5.3, the scheme [ac, bc, dc] gives the G-decomposition
for which $B_1 = A_3$, $B_2 = A_4 + A_1^{-1}$ and
$B_3 = A_2^{-1}$. In this example notice that although ba and be
were not $\Gamma$-related in the original graph, once
$\hat{B}_1$ is removed they become $\Gamma$-related in the
remaining subgraph and their implication classes merge.
In general, it can be shown that each implication class of
$E_{i+1}$ will be the union of either 1 or 2 implication classes
of $E_i$.

The  next  theorem  legitimizes  the  use  of  G-decompositions 
as  a  constructive  tool  for  deciding  whether  an  undirected  graph 
is  a  comparability  graph,  and  if  so,  producing  a  transitive 
orientation.   Proofs  of  this  theorem  can  be  found  in  Golumbic 
[1977a]  or  Golumbic  [1980] . 

Theorem 5.1 (The TRO Theorem). Let G = (V,E) be an
undirected graph with G-decomposition
$E = \hat{B}_1 + \cdots + \hat{B}_k$.
The following statements are equivalent:

(i)    G = (V,E) is a comparability graph;

(ii)   $A \cap A^{-1} = \emptyset$ for all implication classes A of E;

(iii)  $B_i \cap B_i^{-1} = \emptyset$ for i = 1, ..., k.

Furthermore, when these conditions hold, $B_1 + \cdots + B_k$
is a transitive orientation of E.

By  combining  the  TRO  Theorem  with  the  Decomposition 
Algorithm,  we  obtain  an  algorithm  for  recognizing  comparability 
graphs  and  assigning  a  transitive  orientation. 

Algorithm 5.2. TRO Algorithm

Input:   An undirected graph G = (V,E).

Output:  A transitive orientation F of the edges of G if FLAG
has final value 0, or a message that G is not a comparability
graph if FLAG has final value 1.

Method:  The entire algorithm is as follows:

initialize:  i ← 1;  E_1 ← E;  F ← ∅;  FLAG ← 0;

I:    arbitrarily pick an edge x_i y_i ∈ E_i;

II:   enumerate the implication class B_i of E_i containing x_i y_i;
      if B_i ∩ B_i^{-1} = ∅ then
          add B_i to F;
      else
          FLAG ← 1;  [G is not a comparability graph];

III:  define E_{i+1} ← E_i - B̂_i;

IV:   if E_{i+1} = ∅ then
          k ← i;  STOP  [F is a transitive orientation of G];
      else
          i ← i + 1;  go to I;

The sequence of arbitrary choices made in line I of the
algorithm determines which of the many transitive orientations
of G is produced by the algorithm. A different scheme
may give a different transitive orientation. But, when you
try out a few different schemes you will notice a remarkable
phenomenon: No matter how the arbitrary choices for G are
made, the number of iterations k will always be the same.
This phenomenon is actually true for any graph G. A
characterization of the underlying mathematical structure which
causes it is given in Golumbic [1977a, 1980].
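
Readers who wish to try out a few schemes may find the
following sketch of the TRO Algorithm useful. It enumerates
each implication class by a simple search over the edges that
remain, so it makes no attempt at efficiency. The function
name, the return convention (None in place of FLAG = 1) and
the dictionary representation are assumptions of this example.

```python
def transitive_orientation(adj):
    """Sketch of the TRO Algorithm.

    adj: dict mapping each vertex to the set of its neighbors.
    Returns (F, k): F is a set of directed edges forming a transitive
    orientation, or None if G is not a comparability graph; k is the
    number of iterations performed.
    """
    rem = {v: set(ns) for v, ns in adj.items()}   # the shrinking edge set
    F, k, ok = set(), 0, True
    while any(rem.values()):
        k += 1
        x = next(v for v in rem if rem[v])        # step I: arbitrary edge
        y = next(iter(rem[x]))
        B, stack = {(x, y)}, [(x, y)]             # step II: its class
        while stack:
            a, b = stack.pop()
            forced = [(a, c) for c in rem[a] if c not in rem[b]]
            forced += [(c, b) for c in rem[b] if c not in rem[a]]
            for e in forced:
                if e not in B:
                    B.add(e)
                    stack.append(e)
        if any((b, a) in B for a, b in B):
            ok = False                            # FLAG would be set to 1
        else:
            F |= B
        for a, b in B:                            # step III: remove the class
            rem[a].discard(b)                     # together with its reversal
            rem[b].discard(a)
    return (F if ok else None), k
```

On the edge set ab, cd, ac, ad, ae, bc, bd, be (read off from
the implication classes listed for Figure 5.3) the sketch
performs three iterations whichever edges happen to be picked.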

A more detailed version of Algorithms 5.1 and 5.2 will
suggest how we may construct a G-decomposition and test
transitive orientability of an undirected graph G = (V,E)
in $O(\delta |E|)$ time and O(|V| + |E|) space, where $\delta$
is the maximum degree of a vertex. Let G = (V,E) be an
undirected graph with vertices $v_1, v_2, \ldots, v_n$. In the
algorithm below we use the function

$$
\mathrm{CLASS}(i,j) =
\begin{cases}
0 & \text{if } v_i v_j \notin E \\
k & \text{if } v_i v_j \text{ has been assigned to } B_k \\
-k & \text{if } v_i v_j \text{ has been assigned to } B_k^{-1} \\
\text{undefined} & \text{if } v_i v_j \in E \text{ has not yet been assigned}
\end{cases}
$$

and |CLASS(i,j)| denotes the absolute value of CLASS(i,j).



Algorithm 5.3. Decomposition Algorithm (detailed version)

Input:   The adjacency sets of an undirected graph G = (V,E)
with vertices $v_1, v_2, \ldots, v_n$.

Output:  A G-decomposition of the graph given by the final
value of CLASS, and a variable FLAG which is 0 if the graph
is a comparability graph and 1 otherwise. If the algorithm
terminates with FLAG equal to 0, then a transitive orientation
of G is obtained by combining all edges having positive
CLASS.

Method:  The algorithm proceeds until all edges have been
explored. In the kth iteration an unexplored edge is placed
in $B_k$ (its CLASS is changed to k). Whenever an edge is
placed into $B_k$ it is explored using the recursive procedure
of Figure 5.4 by adding to $B_k$ those edges $\Gamma$-related to
it in the graph $E_k$. (Notice that $v_i v_j \in E_k$ if and only
if either |CLASS(i,j)| equals k or CLASS(i,j) is undefined
throughout the kth iteration.) The variable FLAG is changed
from 0 to 1 the first time a $B_k$ is found such that
$B_k \cap B_k^{-1} \neq \emptyset$. At that point it is known
that G is not a comparability graph (by Theorem 5.1). The
algorithm is as follows.


initialize:  k ← 0;  FLAG ← 0;
for each edge v_i v_j in E do
    if CLASS(i,j) is undefined then do
        k ← k + 1;
        CLASS(i,j) ← k;  CLASS(j,i) ← -k;
        EXPLORE(i,j);
    end;
end;

procedure EXPLORE(i,j):
    loop 1:  for each m ∈ Adj(i) such that
                 [m ∉ Adj(j) or |CLASS(j,m)| < k] do
                 if CLASS(i,m) is undefined then do
                     CLASS(i,m) ← k;  CLASS(m,i) ← -k;
                     EXPLORE(i,m);  end
                 else if CLASS(i,m) = -k then do
                     CLASS(i,m) ← k;  FLAG ← 1;
                     EXPLORE(i,m);  end
             end loop 1
    loop 2:  for each m ∈ Adj(j) such that
                 [m ∉ Adj(i) or |CLASS(i,m)| < k] do
                 if CLASS(m,j) is undefined then do
                     CLASS(m,j) ← k;  CLASS(j,m) ← -k;
                     EXPLORE(m,j);  end
                 else if CLASS(m,j) = -k then do
                     CLASS(m,j) ← k;  FLAG ← 1;
                     EXPLORE(m,j);  end
             end loop 2
    return
end

Figure 5.4
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Complexity  analysis:   We  begin  by  specifying  an  appropriate 
data  structure.   The  adjacency  sets  are  stored  as  linked  lists 
sorted  into  increasing  order.   The  element  of  the   list  Adj (i) 
which  represents   edge  v. v.  will  contain  j,  CLASS (i,j),  a 
pointer  to  CLASS(j,i),  and  a  pointer  to  the  next  element  on 
Adj (i) .   The  storage  requirement  for  this  data  structure  is 
0(|v|  +  |eI),  and  the  entire  initialization  of  the  data 
structure  can  be  accomplished  in  linear  time. 

The crucial factor in the analysis of our algorithm is
the time required to access or assign the CLASS function.
Consider the first loop of EXPLORE(i,j). Two temporary pointers
simultaneously scan Adj(i) and Adj(j) looking for values of m
which satisfy the condition in the for statement. This loop
can be executed in $O(d_i + d_j)$ steps. The second loop is done
similarly, hence the time complexity of EXPLORE(i,j) is
$O(d_i + d_j)$.

In the main program, a pointer scans each adjacency list
successively in the for loop, implying a time complexity of
$O(|E|)$. Finally, the algorithm calls EXPLORE once for each
edge or its reversal (both if their implication classes are
not disjoint). Therefore, since

$$\sum_{v_i v_j \in E} (d_i + d_j) = O(\delta |E|),$$

where $\delta$ is the maximum degree of a vertex, it follows
that the time complexity for the entire algorithm
(including preprocessing the input) is at most $O(\delta |E|)$.
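
As a rough illustration, the procedure of Figure 5.4 can be
sketched in Python. The function name explore_classes, the use of
a dictionary for CLASS (a missing key stands for "undefined"), and
the helper removed are assumptions made for this sketch; the
sketch uses hashed neighbour sets rather than the sorted linked
lists of the complexity analysis, so it does not reproduce the
stated running time. It assumes that FLAG = 0 means the edges of
positive class form a transitive orientation, which is not stated
in this section.

```python
import sys

def explore_classes(adj):
    """Run the CLASS/EXPLORE scheme of Figure 5.4.

    adj maps each vertex to the set of its neighbours (symmetric,
    no loops). Returns (cls, flag), where cls[(i, j)] is k or -k.
    """
    sys.setrecursionlimit(10000)  # EXPLORE may recurse once per edge
    cls = {}                      # a missing key means "undefined"
    flag = 0
    k = 0

    def removed(a, b):
        # True when ab is not an edge or was classified in an earlier
        # round; an undefined class never counts as removed.
        return b not in adj[a] or abs(cls.get((a, b), k)) < k

    def explore(i, j):
        nonlocal flag
        for m in sorted(adj[i]):              # loop 1
            if removed(j, m):
                if (i, m) not in cls:
                    cls[(i, m)], cls[(m, i)] = k, -k
                    explore(i, m)
                elif cls[(i, m)] == -k:
                    cls[(i, m)] = k
                    flag = 1
                    explore(i, m)
        for m in sorted(adj[j]):              # loop 2
            if removed(i, m):
                if (m, j) not in cls:
                    cls[(m, j)], cls[(j, m)] = k, -k
                    explore(m, j)
                elif cls[(m, j)] == -k:
                    cls[(m, j)] = k
                    flag = 1
                    explore(m, j)

    for i in sorted(adj):
        for j in sorted(adj[i]):
            if (i, j) not in cls:
                k += 1
                cls[(i, j)], cls[(j, i)] = k, -k
                explore(i, j)
    return cls, flag


def cycle(n):
    """Adjacency sets of the cycle on vertices 0..n-1 (test helper)."""
    return {v: {(v - 1) % n, (v + 1) % n} for v in range(n)}
```

On the 4-cycle the sketch leaves FLAG at 0, while on the 5-cycle,
which is not a comparability graph, FLAG is set to 1.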


Coloring  and  Other  Problems  on  Comparability  Graphs 

Suppose that G is a comparability graph, and let F be a
transitive orientation of G. A height function h can be
placed on V as follows: $h(v) = 0$ if v is a sink; otherwise,
$h(v) = 1 + \max\{h(w) \mid vw \in F\}$. The height function can
be assigned in linear time using a recursive depth-first search,
and it is a proper vertex coloring of G. The number of
colors used will be equal to the number of vertices in the
longest path of F, and since, by transitivity, every path
in F corresponds to a clique of G, the height function will
yield a coloring which uses exactly $\omega(G)$ colors, which
is the best possible. Therefore, from the transitive
orientation F we can assign a minimum coloring to G using
the height function in $O(|V| + |E|)$ steps and, at the same
time, calculate a maximum clique of G. We will illustrate
this by solving the more general problem of finding a maximum
weighted clique of a comparability graph.

(If  all  vertices  have  the  same  weight,  then  the  problem 
is  reduced  to  the  usual  problem  of  finding  a  clique  of 
maximum  cardinality.)   In  general  the  maximum  weighted 
clique  problem  is  NP-complete,  but  when  restricted  to  compar- 
ability  graphs   it   becomes   tractable. 

Algorithm 5.4.  Minimum Coloring and Maximum Weighted
Clique of a Comparability Graph.

Input:   The adjacency sets of a transitive orientation F
of a comparability graph G = (V,E) and a weight function
w defined on V.

Output:  A minimum coloring of G and a clique K of G
whose weight is maximum.

Method:  We use a modification of the height calculation
technique employing the recursive depth-first search procedure
SEARCH in Figure 5.5. To each vertex v we associate its COLOR
and its cumulative weight W(v), which equals the weight of
the heaviest path from v to some sink. A pointer is assigned
to v designating its successor on that heaviest path. Once
the cumulative weights are assigned, the clique K is calculated
beginning at the line labeled retrace. The algorithm is given
in the form of a procedure.

procedure MAXWEIGHT CLIQUE(V,F):
    for all v ∈ V do
        if v is unsearched then
            SEARCH(v);
    end
retrace:  select y ∈ V such that W(y) = max {W(v) | v ∈ V};
    K ← {y}; y ← POINTER(y);
    while y ≠ Λ do
        K ← K ∪ {y}; y ← POINTER(y);
    end
    return K;
end
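
Algorithm 5.4, together with the SEARCH procedure of Figure 5.5,
might be sketched in Python roughly as follows. The function name
and the dictionaries succ (out-neighbours in F) and weight are
assumptions of this sketch, and None plays the role of the null
pointer Λ. The recursion is not protected against deep call
chains, so this is only suitable for small examples.

```python
def min_coloring_and_max_weight_clique(succ, weight):
    """succ[v] lists every w with an arc v -> w in the transitive
    orientation F; weight[v] is the weight of vertex v.
    Returns (color, clique)."""
    W, pointer, color = {}, {}, {}

    def search(v):
        if not succ[v]:                        # v is a sink
            W[v], pointer[v], color[v] = weight[v], None, 0
            return
        for x in succ[v]:
            if x not in W:                     # x is unsearched
                search(x)
        y = max(succ[v], key=lambda x: W[x])   # heaviest successor
        W[v], pointer[v] = weight[v] + W[y], y
        color[v] = 1 + max(color[x] for x in succ[v])

    for v in succ:
        if v not in W:
            search(v)

    # retrace: follow the pointers from a vertex of maximum W
    y = max(succ, key=lambda v: W[v])
    clique = []
    while y is not None:
        clique.append(y)
        y = pointer[y]
    return color, clique


# Small hand-made example: arcs a->b, a->c, b->c, d->c.
succ = {"a": ["b", "c"], "b": ["c"], "c": [], "d": ["c"]}
weight = {"a": 1, "b": 1, "c": 1, "d": 5}
color, clique = min_coloring_and_max_weight_clique(succ, weight)
```

In this example the heaviest clique is {d, c} with weight 6, and
the coloring uses three colors, matching the clique {a, b, c}.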

We conclude with an interesting polynomial-time method
for finding $\alpha(G)$, the size of the largest stable set of a
comparability graph G. We transform a transitive orientation
(V,F) of G into a transportation network by adding two
new vertices s and t and edges sx and yt for each source x
and sink y of F. Assigning a lower capacity of 1 to each
vertex, we initialize a compatible integer-valued flow and
then call a minimum-flow algorithm. The value of the
minimum flow will equal the size of the smallest covering
of the vertices by cliques, which in turn will equal the
size of the largest independent set since every comparability
graph is perfect. Such a minimum-flow algorithm can run in
polynomial time.

procedure SEARCH(v):
    if Adj(v) = ∅ then do
        W(v) ← w(v); POINTER(v) ← Λ; COLOR(v) ← 0; end
    else do
        for all x ∈ Adj(v) do
            if x is unsearched then
                SEARCH(x); end
        select y ∈ Adj(v) such that W(y) = max {W(x) | x ∈ Adj(v)};
        W(v) ← w(v) + W(y); POINTER(v) ← y;
        select z ∈ Adj(v) such that COLOR(z) = max {COLOR(z) |
                z ∈ Adj(v)};
        COLOR(v) ← 1 + COLOR(z);
    end
    return
end

Figure 5.5

6.   Split Graphs

An undirected graph G = (V,E) is a split graph if there
is a partition V = S + K of its vertex set into a stable
set S and a complete set K. Since a stable set of G is
a complete set of the complement $\overline{G}$ and vice versa, G is
a split graph if and only if its complement $\overline{G}$ is a split
graph. Foldes and Hammer [1977] have given the following
characterization of split graphs.

Theorem 6.1.  Let G be an undirected graph. The following
conditions are equivalent:
(i)    G is a split graph;
(ii)   G and $\overline{G}$ are triangulated graphs;
(iii)  G contains no induced subgraph isomorphic to $2K_2$, $C_4$,
       or $C_5$.

An alternate characterization of split graphs in terms of
degree sequences is the following result of Hammer and Simeone
[1977].

Theorem 6.2.  Let G = (V,E) be an undirected graph with
degree sequence $d_1 \ge d_2 \ge \cdots \ge d_n$, and let
$m = \max\{i \mid d_i \ge i - 1\}$. Then, G is a split graph
if and only if

$$\sum_{i=1}^{m} d_i = m(m-1) + \sum_{i=m+1}^{n} d_i.$$

Furthermore, if this is the case, the m vertices of largest
degree will be a maximum complete set of G.
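
The degree-sequence test of Theorem 6.2 is short enough to write
out. The following Python sketch (the function name is an
assumption of the sketch) takes the list of vertex degrees in any
order; the sort dominates the running time.

```python
def is_split_graph(degrees):
    """Apply the criterion of Theorem 6.2 to a list of degrees."""
    d = sorted(degrees, reverse=True)      # d[0] >= d[1] >= ...
    # m = max{ i : d_i >= i - 1 } with 1-based i. Because d is
    # non-increasing, the qualifying indices form a prefix.
    m = 0
    while m < len(d) and d[m] >= m:        # tests d_{m+1} >= (m+1) - 1
        m += 1
    return sum(d[:m]) == m * (m - 1) + sum(d[m:])


print(is_split_graph([1, 2, 2, 1]))   # path on four vertices
print(is_split_graph([2, 2, 2, 2]))   # the cycle C_4
```

The path on four vertices passes the test (its two middle vertices
form the complete set), while $C_4$ fails it, as Theorem 6.1
requires.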

A simple recognition algorithm for split graphs can be
designed by applying Theorem 6.2. If this is done, it can
easily be seen that the complexity of recognizing split
graphs is $O(n \log n)$. The same complexity applies for the
clique problem and the stable set problem on split graphs.
However, the Hamiltonian circuit problem on split graphs
is NP-complete.

7.   Permutation Graphs

Let $\pi = [\pi_1, \pi_2, \ldots, \pi_n]$ be a permutation of the
numbers 1,2,...,n, and let $\pi^{-1}_i$ denote the position of
the number i in the sequence $\pi$. We define the undirected
graph $G[\pi] = (V,E)$ as follows:

$$V = \{v_1, v_2, \ldots, v_n\}$$

and

$$(v_i, v_j) \in E \quad\text{iff}\quad (i - j)(\pi^{-1}_i - \pi^{-1}_j) < 0.$$

Two vertices are joined by an edge if they occur out of
their proper order reading the sequence $\pi$ left to right
(see Figure 7.1).

Figure 7.1.  The Graph G[4,1,3,5,2]


If we reverse the sequence $\pi$, each pair of numbers
which occur in the correct order in $\pi$ will now be in the
wrong order, and vice versa. Thus, the permutation graph
we obtain will be the complement of $G[\pi]$. This shows that
the complement of a permutation graph is also a permutation
graph.
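
The definition and the reversal argument can be checked on a small
case. In the Python sketch below, the function name and the
representation of an edge as a frozenset of the two numbers are
assumptions of the sketch; pos[i] stands for the position of the
number i in the sequence.

```python
from itertools import combinations

def permutation_graph_edges(pi):
    """Edges of G[pi]: pairs of numbers that appear out of order."""
    pos = {x: p for p, x in enumerate(pi)}   # position of each number
    return {frozenset((i, j))
            for i, j in combinations(sorted(pi), 2)
            if (i - j) * (pos[i] - pos[j]) < 0}


pi = [4, 1, 3, 5, 2]
edges = permutation_graph_edges(pi)
all_pairs = {frozenset(p) for p in combinations(pi, 2)}
# Reversing the sequence yields exactly the complementary edge set.
complement = permutation_graph_edges(pi[::-1])
```

For $\pi = [4,1,3,5,2]$ the edges are the five inverted pairs
{4,1}, {4,3}, {4,2}, {3,2}, {5,2}, and the reversed sequence gives
the remaining five pairs.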

Another property of the graph $G[\pi]$ is that it is
transitively orientable. If we orient each edge toward
its larger endpoint, then we will obtain a transitive
orientation F. For, suppose $(v_i, v_j) \in F$ and
$(v_j, v_k) \in F$; then $i < j < k$ and
$\pi^{-1}_i > \pi^{-1}_j > \pi^{-1}_k$, which implies
that $(v_i, v_k) \in F$. This is only half of the story; we
actually have the following result of Pnueli, Lempel, and
Even [1971].

Theorem 7.1.  An undirected graph G is a permutation
graph if and only if G and $\overline{G}$ are comparability graphs.

Theorem 7.1 suggests an algorithm for recognizing
permutation graphs, namely, applying the transitive orientation
algorithm to the graph and to its complement. If we
succeed in finding transitive orientations, then the graph
is a permutation graph. To find a suitable permutation we
can follow the construction procedure in the proof of the
theorem, which can be found in Golumbic [1980]. The
entire method requires $O(n^3)$ time and $O(n^2)$ space.


Permutation graphs are useful in a number of applications
(Even, Pnueli, and Lempel [1972], Tarjan [1972],
Golumbic [1980]). Of particular interest in this context
is the following very efficient coloring algorithm for $G[\pi]$.


Algorithm 7.1.  Coloring a Permutation Graph

Input:   A permutation $\pi = [\pi_1, \pi_2, \ldots, \pi_n]$ of the
numbers {1,2,...,n}.

Output:  A coloring of the vertices of $G[\pi]$ and the chromatic
number $\chi$ of $G[\pi]$.

Method:  The vertices of $G[\pi]$ are assigned colors in the
order $\pi_1, \pi_2, \ldots, \pi_n$, although the graph itself is
never actually calculated. A counter k will keep track of the
total number of colors used so far, and an array LAST(c)
will contain the number of the vertex which was the last
to receive color c. During the jth time through the loop
we color $\pi_j$ with the smallest color q satisfying
$\pi_j \ge \mathrm{LAST}(q)$. The entire algorithm is as follows.


    procedure
1.  initialize:  k ← 0; for i ← 1 to n do LAST(i) ← 0; end
2.  loop:        for j ← 1 to n do
3.                   m ← min {q | π_j ≥ LAST(q)};
4.                   COLOR(π_j) ← m;
5.                   LAST(m) ← π_j;
6.                   k ← max {k, m};
                 end loop
7.               χ ← k;
    end


Example.  Let us illustrate Algorithm 7.1 on the permutation
$\pi = [4,1,3,5,2]$. After the initializations in line 1 the
following assignments will be made in the loop:


j ← 1           j ← 2           j ← 3           j ← 4           j ← 5
m ← 1           m ← 2           m ← 2           m ← 1           m ← 3
COLOR(4) ← 1    COLOR(1) ← 2    COLOR(3) ← 2    COLOR(5) ← 1    COLOR(2) ← 3
LAST(1) ← 4     LAST(2) ← 1     LAST(2) ← 3     LAST(1) ← 5     LAST(3) ← 2
k ← 1           k ← 2           k ← 2           k ← 2           k ← 3


Thus the chromatic number of $G[\pi]$ is 3 and a 3-coloring has been
assigned.
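
A compact Python sketch of Algorithm 7.1 is given below. It relies
on an observation that is easy to verify by induction but is an
addition of this sketch: after every step the values LAST(1),
LAST(2), ..., LAST(k) are strictly decreasing, so the smallest
admissible color in line 3 can be located by binary search. The
list neg_last, which stores the negated LAST values so that
bisect_left can be applied to an ascending list, and the function
name are assumptions of the sketch.

```python
from bisect import bisect_left

def color_permutation_graph(pi):
    """First-fit coloring of G[pi] in the order pi_1, ..., pi_n.

    Returns (color, chi), where color maps each number to its color
    (1-based) and chi is the number of colors used."""
    neg_last = []    # neg_last[q - 1] == -LAST(q), kept ascending
    color = {}
    for x in pi:
        # Smallest q with LAST(q) <= x, i.e. with -LAST(q) >= -x.
        q = bisect_left(neg_last, -x)
        if q == len(neg_last):
            neg_last.append(-x)     # open a new color
        else:
            neg_last[q] = -x        # x becomes LAST of color q + 1
        color[x] = q + 1
    return color, len(neg_last)


color, chi = color_permutation_graph([4, 1, 3, 5, 2])
print(chi)
```

Each color class produced this way is an increasing subsequence of
$\pi$, so no two of its members are adjacent in $G[\pi]$. For
$\pi = [4,1,3,5,2]$ the sketch reports three colors.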

The complexity of Algorithm 7.1 is $O(n \log \chi)$ if line 3
is implemented using binary search. A proof of the correctness
of this algorithm can be found in Golumbic [1981]. Algorithm
7.1 can be used to color any permutation graph G
in $O(n \log n)$ time provided we are given the permutation $\pi$
and the isomorphism $G \to G[\pi]$. If we do not have $\pi$, then we
would use Algorithm 5.4.

8.   Interval Graphs

An undirected graph G is called an interval graph if
its vertices can be put into one-to-one correspondence with
a set of intervals I of a linearly ordered set (like the
real line) such that two vertices are connected by an edge
of G if and only if their corresponding intervals have
nonempty intersection. We call I an interval representation
for G. (It is unimportant whether we use open intervals
or closed intervals; the resulting class of graphs will be
the same.)

The  following  characterization  of  interval  graphs  is 
due  to  Gilmore  and  Hoffman  [1964]. 

Theorem 8.1.  An undirected graph G is an interval graph
if and only if G is a triangulated graph and its complement
$\overline{G}$ is a comparability graph.

The coloring, clique, stable set, and clique cover
problems can be solved in polynomial time for interval graphs
by using the algorithms of Sections 4 or 5, and a recognition
algorithm could be obtained by combining the algorithms for
triangulated graphs and comparability graphs. However, the
recognition algorithm presented in Booth and Lueker [1976]
is asymptotically more efficient. They have shown that a
data structure called a PQ-tree can be used to obtain a
linear-time algorithm.

Interval graphs have become particularly useful mathematical
structures for modeling real world problems. The line,
on which the intervals rest, may represent anything that is
normally regarded as one-dimensional. The linearity may be
due to physical restriction such as blemishes on a
microorganism, speed traps on a highway, or files in sequential
storage in a computer. It may arise from time dependencies
as in the case of the life span of persons or cars, or jobs
on a fixed time schedule. A cost function may be the reason
as with the approximate worth of some fine wines or the
potential for growth of a portfolio of securities.

The  task  to  be  performed  on  an  interval  graph  will 
vary  from  problem  to  problem.   If  what  is  required  is  to  find 
a  coloring  or  a  maximum  weighted  stable  set  or  a  large  clique, 
then  fast  algorithms  are  available.   If  a  Hamiltonian  circuit 
must  be  found,  then  there  are  no  known  efficient  algorithms 
(unless  the  graph  has  more  structure  than  just  being  an 
interval graph). Also, the speed with which such a problem
can  be  solved  will  depend  partially  on  whether  we  are  given 
simply  the  interval  graph  G,  or,  in  addition,  an  interval 
representation  of  G. 

We  have  already  seen  one  application  of  interval  graphs 
in  the  opening  paragraph  of  this  article.   The  interested 
reader  is  referred  to  Roberts  [1976,  1978]  and  Golumbic  [1980] 
for  numerous  other  applications.   We  will  discuss  here  a 
recent  application  of  interval  graphs  to  optimal  macro  substi- 
tutions suggested  by  Golumbic,  Goss,  and  Dewar  [1980]. 

The  compiler  or  interpreter  for  a  microcomputer  system 
may  be  regarded  as  a  byte  sequence  which  resides  in  main 
memory.   Due  to  restrictions  on  the  size  of  main  memory,  it 
is  desirable  to  compact  this  byte  sequence.   One  technique 
is  to  define  a  set  of  macro  substitutions  which  allow  occur- 
rences  of  specified  byte  subsequences  to  be  replaced  by 
single  bytes.   The  subsequences  are  restored  dynamically  at 
run  time  by  use  of  an  associated  table. 

Figure 8.1 shows a sequence of hexadecimal digits of
length 36. Since the digits E and F do not appear, they may
be used to indicate macros. Choosing E = 6A2 and F = 43B96,
the original sequence may be reduced to length 20. Notice
that when two macros overlap, only one can be replaced.
This overlapping phenomenon, therefore, restricts how the
macro table may be applied.

Original Sequence:     6A2C43B960D606A21C786A243B96A23C6A25
                                                  ^
                                               OVERLAP
Macro Table:           E = 6A2     F = 43B96
Abbreviated Sequence:  ECF0D60E1C78EFA23CE5

Figure 8.1.  Macro Substitution.

The problem to be solved is to choose an optimal set of
macro substitutions and an order for performing the
substitutions which minimizes the total length of the byte
sequence and associated table. Formally we require the
following.

Input:   A byte sequence B of length n.

Output:  A set of m macros, each of length $\le k$, and an order
         for performing the substitutions such that the
         total length of the abbreviated sequence and macro
         table is minimized.

The  reason  for  specifying  a  bound  on  the  length  of  the  macros 
is  that  in  practice  we  may  want  them  to  be  very  short  compared 
to  the  length  of  the  original  sequence. 

Notice  that  there  are  actually  two  aspects  to  the  problem: 

(1)  choosing  a  macro  set,   and 

(2)  using  the  macro  set  optimally. 

Let $B = \langle b_1, b_2, \ldots, b_n \rangle$ be a sequence of bytes
and let k be a fixed constant. The length of B is denoted by
$|B| = n$. A subsequence $\langle b_i, \ldots, b_j \rangle$ of B is
denoted by B[i,j]. Clearly, $|B[i,j]| = j - i + 1$. The weighted
interval graph G = (V,E,w) that we will associate with B is
defined as follows: The vertex set V consists of all intervals
[i,j] satisfying $1 \le j - i \le k - 1$; two vertices v = [i,j]
and u = [i',j'] are connected by an edge iff they intersect,
i.e., either $i' \le j \le j'$ or $i \le j' \le j$; the weight w(v)
of a vertex v = [i,j] is equal to $j - i$, which represents the
number of bytes that would be saved by replacing B[i,j] by a
single byte.

It is easy to see that the number of vertices of G is
slightly less than kn and the number of edges is less than
but on the order of $k^3 n$. Furthermore, the graph does not
actually have to be calculated and stored since any query
about adjacency of vertices can be answered by a simple
comparison of the indices of their corresponding
subsequences.
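
To make the size estimate and the adjacency test concrete, here is
a small Python sketch. It assumes that the vertex set consists of
the intervals [i,j] with $1 \le j - i \le k - 1$ and 1-based
indices; the function names are hypothetical.

```python
def macro_intervals(n, k):
    """All intervals [i, j] with 1 <= j - i <= k - 1 inside a
    sequence of length n (indices are 1-based)."""
    return [(i, j)
            for i in range(1, n + 1)
            for j in range(i + 1, min(i + k - 1, n) + 1)]


def adjacent(u, v):
    """Adjacency in G, decided from the interval endpoints alone."""
    (i, j), (i2, j2) = u, v
    return u != v and (i2 <= j <= j2 or i <= j2 <= j)


V = macro_intervals(10, 3)
print(len(V))
```

For $n \ge k$ the count is $(k-1)n - k(k-1)/2$, which is indeed
slightly less than kn; with n = 10 and k = 3 this gives 17
vertices.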

Let M be a subset of V and let

$$B[M] = \{B[i,j] \mid [i,j] \in M\}.$$

We may think of B[M] as the macro table generated by M. To
perform the macro substitutions we would find all occurrences
of these macros and then choose a subset of the occurrences,
no two of which intersect, to be abbreviated. Such a subset
corresponds precisely to a stable set of the interval graph G.
(Notice that this model does not permit embedding one macro
in another macro.) Moreover, to make the abbreviated sequence
as short as possible, we would like a stable set whose weight
is maximum. (The weight of a subset of vertices is the sum
of the weights of its members.) This method is summarized
in Figure 8.2.

procedure SUBSTITUTION(M):
    C(M) ← {[i,j] ∈ V | B[i,j] = B[i',j'] for some [i',j'] ∈ M};
    X(M) ← MAXIMUM WEIGHTED STABLE SET OF THE
           INDUCED SUBGRAPH G_{C(M)};
    SAVINGS(M) ← Σ_{u ∈ X(M)} w(u) − Σ_{v ∈ M} w(v);
end

Figure 8.2.  Finding an Optimal Macro Substitution
for a Given Set of Macros
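
The procedure of Figure 8.2 can be tried out on the sequence of
Figure 8.1 with the Python sketch below. Several choices here are
assumptions of the sketch rather than the method of the text: the
macro table is passed as a dictionary from a one-character code to
the string it abbreviates, the occurrences are found with a plain
str.find scan instead of a Morris and Pratt style matcher, and the
maximum weighted stable set is computed by the standard dynamic
program for weighted interval scheduling.

```python
from bisect import bisect_right

def substitution(seq, table):
    """Return (abbreviated sequence, SAVINGS) for a macro table."""
    # C(M): every occurrence of every macro as (start, end, code),
    # with 0-based inclusive endpoints.
    cands = []
    for code, mac in table.items():
        pos = seq.find(mac)
        while pos != -1:
            cands.append((pos, pos + len(mac) - 1, code))
            pos = seq.find(mac, pos + 1)
    cands.sort(key=lambda c: c[1])          # order by right endpoint
    ends = [c[1] for c in cands]

    # best[t]: maximum weight of a stable set among the first t
    # candidates; the weight of an occurrence is end - start.
    best = [0] * (len(cands) + 1)
    prev = [0] * len(cands)
    for t, (start, end, _) in enumerate(cands):
        # number of earlier candidates that end before `start`
        prev[t] = bisect_right(ends, start - 1, 0, t)
        best[t + 1] = max(best[t], best[prev[t]] + (end - start))

    # X(M): walk back through the table to recover the chosen set.
    chosen, t = [], len(cands)
    while t > 0:
        if best[t] > best[t - 1]:           # candidate t - 1 was used
            chosen.append(cands[t - 1])
            t = prev[t - 1]
        else:
            t -= 1
    chosen.reverse()

    # Apply the chosen substitutions from left to right.
    out, cursor = [], 0
    for start, end, code in chosen:
        out.append(seq[cursor:start])
        out.append(code)
        cursor = end + 1
    out.append(seq[cursor:])
    table_cost = sum(len(mac) - 1 for mac in table.values())
    return "".join(out), best[-1] - table_cost


original = "6A2C43B960D606A21C786A243B96A23C6A25"
short, savings = substitution(original, {"E": "6A2", "F": "43B96"})
print(short, savings)
```

On this input the sketch yields the abbreviated sequence of
Figure 8.1, of length 20. The occurrences replaced have total
weight 16, and subtracting the table cost of 2 + 4 = 6 gives a
SAVINGS value of 10.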


The set C(M) consists of all intervals representing
"candidate" subsequences which may be replaced using the
macro table B[M]. Of these candidates only the subsequences
represented by X(M) will be replaced. The SAVINGS is calculated
by summing the savings obtained for each macro substitution
and subtracting the cost of storing the macro table.
Using SUBSTITUTION we obtain the following algorithm
which gives an optimal solution to the general problem.

Algorithm  8.1 

loop:     for all M ⊆ V such that |M| = m do
              call SUBSTITUTION(M);
          end loop
          return the M and X(M) whose SAVINGS(M) is maximum;

The number of passes through the loop in Algorithm 8.1
is on the order of $\binom{kn}{m}$ since G has $O(kn)$ vertices. (In
practice, some of the subsets M may be ruled out due to
other criteria, for example, by requiring that macros begin
with certain designated bytes. This would lower the number
of passes.) The complexity of SUBSTITUTION depends on how
efficiently we are able to find C(M) and X(M) for a given M.
Using a modification of the deterministic pattern matching
algorithm of Morris and Pratt [1970], C(M) can be calculated
in $O(m(k + n))$ time. See also Aho, Hopcroft and Ullman [1976,
Chapter 9]. Since a maximum stable set of an interval graph
G = (V,E) may be found in time $O(|V| + |E|)$, X(M) can be
calculated in $O(k^3 n)$ time. Hence, we conclude that the
worst case complexity of SUBSTITUTION is $O(m(k+n) + k^3 n)$
and the worst case complexity of Algorithm 8.1 is

$$O(c_{m,k}\, n^{m+1}) \quad\text{where}\quad c_{m,k} = \frac{k^3}{\sqrt{2\pi m}} \left(\frac{ek}{m}\right)^{m},$$

which is, in terms of the length of the input sequence, a
polynomial whose degree depends on the constant m.

Notice that our model has not allowed the embedding of
macros in other macros. A reason for this could be that it
is impractical to implement the stack necessary to allow
embedding. In some applications one may choose to allow
embedding. If this is the case, a similar model can be
designed which uses overlap graphs rather than interval
graphs. An overlap graph is the same as an interval graph
in which there are no edges between pairs of vertices whose
corresponding intervals have one properly contained in the
other.

Our Algorithm 8.1 and SUBSTITUTION will also be optimal
using the overlap graph model. Their respective complexities,
in this case, will each be raised by one power of kn. This
follows from the fact that a maximum weighted stable set of
an overlap graph G = (V,E) can be calculated in $O(|V| \cdot |E|)$
time (see Gavril [1973] and Golumbic [1980, Chapter 11]).

The  problem  of  macro  substitution  was  recently  applied 
to  MICRO  SPITBOL  for  an  Incoterm  SPD20/40   supporting  64K 
of   main  memory.  The  byte  sequence  for  MICRO  SPITBOL  required 
23,110  bytes  of  storage.   There  were  176  unused  opcodes   which 
were  designated  to  represent  macros.   That  is,  n  =  23110 
and  m  =  176  and  we  set  k  =  20. 

Since the time complexity of Algorithm 8.1 would be
high for this application, an effective technique for finding
a non-optimal solution was needed. A combination of
heuristics and SUBSTITUTION reduced the size of the sequence
to 17,920 bytes and produced a macro table of 962 bytes.
This represents a saving of 4,228 bytes of main storage,
a saving of about 18%. It should be pointed out that an increased
cost of obtaining a very good macro substitution may be
justified by the fact that this is done only once per compiler
and machine and the result presumably will be used many,
many times.